Two-Loop Corrections to Large Order Behavior of φ^4 Theory
We consider the large order behavior of the perturbative expansion of the
scalar field theory by means of an expansion around an instanton solution. We
have computed the series of the free energy up to two-loop order in two and
three dimensions. Topologically, there is only one additional Feynman diagram
with respect to the previously known one-dimensional case, but a careful
treatment of renormalization is needed. The propagator and the Feynman diagrams
were expressed in a form suitable for numerical evaluation. We then obtained
explicit expressions by summing over the distinct eigenvalues determined
numerically, together with the corresponding eigenfunctions.
Comment: 12 pages, 2 figures
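The factorial growth that drives such large-order analyses can be illustrated on a zero-dimensional toy version of a quartic theory, where the "partition function" is an ordinary integral and the series coefficients are exact Gaussian moments. This is a standard pedagogical sketch, not the paper's computation; the normalization of the quartic coupling is an arbitrary choice made here.

```python
import math

def double_factorial(k):
    """(k)!! for odd k >= -1, with the convention (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def coeff(n):
    # n-th series coefficient of the toy integral
    #   Z(g) = (2*pi)**-0.5 * Integral dx exp(-x**2/2 - g*x**4),
    # expanded as Z(g) = sum_n c_n g**n.  Expanding exp(-g*x**4) gives
    #   c_n = (-1)**n <x**(4n)> / n!, with Gaussian moment <x**(4n)> = (4n-1)!!.
    return (-1) ** n * double_factorial(4 * n - 1) / math.factorial(n)

# The ratio c_{n+1}/c_n ~ -16*n exhibits the alternating factorial growth
# c_n ~ (-16)**n n! that an instanton analysis predicts for this toy model.
for n in range(1, 8):
    print(n, coeff(n), coeff(n + 1) / coeff(n) / n)
```

The divergence rate (here 16) is set by the inverse instanton action of the toy integral; expansions around the instanton, in the spirit of the abstract above, refine the prefactor of the analogous growth law.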
High-dimensional manifold of solutions in neural networks: insights from statistical physics
In these pedagogical notes I review the statistical mechanics approach to
neural networks, focusing on the paradigmatic example of the perceptron
architecture with binary and continuous weights, in the classification setting.
I review Gardner's approach based on the replica method and the derivation of
the SAT/UNSAT transition in the storage setting. Then, I discuss some recent
works that unveiled how the zero training error configurations are
geometrically arranged, and how this arrangement changes as the size of the
training set increases. I also illustrate how different regions of the solution
space can be explored analytically and how the landscape in the vicinity of a
solution can be characterized. I give evidence that, in binary weight models,
algorithmic hardness is a consequence of the disappearance of a clustered
region of solutions that extends to very large distances. Finally, I
demonstrate how the study of linear mode connectivity between solutions can
give insights into the average shape of the solution manifold.
Comment: 22 pages, 9 figures, based on a set of lectures given at the "School
of the Italian Society of Statistical Physics", IMT, Lucca
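For intuition on the storage setting, the SAT/UNSAT phenomenology can be probed by brute force at very small size: enumerate all binary weight vectors and count those that classify a set of random patterns with zero training error. This is only an illustrative sketch (the analyses in the notes work at large N via replicas), and the helper names are made up for the example.

```python
import itertools
import random

def count_solutions(patterns, labels):
    """Count binary weight vectors w in {-1,+1}^N such that
    sign(w . xi_mu) == sigma_mu for every pattern mu (zero training error)."""
    n = len(patterns[0])
    count = 0
    for w in itertools.product((-1, 1), repeat=n):
        if all(sigma * sum(wi * xi for wi, xi in zip(w, xi_mu)) > 0
               for xi_mu, sigma in zip(patterns, labels)):
            count += 1
    return count

random.seed(0)
N = 9  # odd, so the pre-activation never vanishes on +-1 patterns
for P in (2, 4, 6, 8, 10):
    patterns = [[random.choice((-1, 1)) for _ in range(N)] for _ in range(P)]
    labels = [random.choice((-1, 1)) for _ in range(P)]
    # the number of zero-error configurations shrinks as P grows,
    # until no solution is left (the UNSAT side of the transition)
    print(P, count_solutions(patterns, labels))
```

At large N the transition sharpens at a critical ratio of patterns to weights; the clustering and local-entropy structure discussed in the notes concerns how the surviving solutions are geometrically arranged.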
Plastic number and possible optimal solutions for a Euclidean 2-matching in one dimension
In this work we consider the problem of finding the minimum-weight loop cover
of an undirected graph. This combinatorial optimization problem is called
2-matching and can be seen as a relaxation of the traveling salesman problem,
since one does not have the unique loop condition. We consider this problem
both on the complete bipartite graph and on the complete graph embedded in a
one-dimensional interval, the weights being chosen as a convex function of the
Euclidean distance between each pair of points. Randomness is introduced by
throwing the points independently and uniformly in the interval. We derive the
average optimal cost in the limit of a large number of points. We prove that
the possible solutions are characterized by the presence of "shoelace" loops
containing 2 or 3 points of each type in the complete bipartite case, and 3, 4
or 5 points in the complete one. This gives rise to an exponential number of
possible solutions scaling as p^N, where p is the plastic constant. This is at
variance with what happens in the previously studied one-dimensional models,
such as the matching and the traveling salesman problem, where for every
instance of the disorder there is only one possible solution.
Comment: 19 pages, 5 figures
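The appearance of the plastic constant can be made concrete with a short numerical sketch: p is the real root of x^3 = x + 1, and it is the growth rate of Padovan-type recurrences P(n) = P(n-2) + P(n-3), the kind of counting that arises when configurations are built from blocks of two or three objects. The recurrence below illustrates such p^N scaling; it is not the paper's exact enumeration of shoelace loops.

```python
def plastic_constant(tol=1e-12):
    # real root of x**3 = x + 1, found by bisection on [1, 2]
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** 3 - mid - 1 < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def padovan(n):
    # Padovan sequence: P(0) = P(1) = P(2) = 1, P(n) = P(n-2) + P(n-3);
    # the ratio of consecutive terms converges to the plastic constant
    a, b, c = 1, 1, 1
    for _ in range(n):
        a, b, c = b, c, a + b
    return a

p = plastic_constant()
print(p)                          # ~1.3247179572, the plastic constant
print(padovan(30) / padovan(29))  # approaches p
```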
Exact value for the average optimal cost of bipartite traveling-salesman and 2-factor problems in two dimensions
We show that the average cost for the traveling-salesman problem in two
dimensions, which is the archetypal problem in combinatorial optimization, in
the bipartite case, is simply related to the average cost of the assignment
problem with the same Euclidean, increasing, convex weights. In this way we
extend a result already known in one dimension, where exact solutions are
available. The recently determined average cost for the assignment when the
cost function is the square of the distance between the points therefore
provides an exact prediction for a large number of points. As a byproduct of
our analysis, the loop covering problem is shown to have the same average
optimal cost. We also explain why this result cannot be extended to higher
dimensions. We numerically check the exact predictions.
Comment: 5 pages, 3 figures
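A small brute-force experiment can probe the relation, and it also checks one inequality that holds at any size: an alternating closed tour on N red and N blue points decomposes into two perfect matchings, so the optimal bipartite TSP cost is at least twice the optimal assignment cost. This is only a numerical-check scaffold with made-up helper names, feasible for very small N.

```python
import itertools
import math
import random

def sq_dist(a, b):
    # squared Euclidean distance, the convex weight used in the abstract
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def assignment_cost(reds, blues):
    # brute-force optimal bipartite assignment
    n = len(reds)
    return min(sum(sq_dist(reds[i], blues[p[i]]) for i in range(n))
               for p in itertools.permutations(range(n)))

def bipartite_tsp_cost(reds, blues):
    # brute-force shortest closed tour alternating red and blue points
    n = len(reds)
    best = math.inf
    for rp in itertools.permutations(range(1, n)):
        order_r = (0,) + rp  # fix the first red point to kill rotations
        for bp in itertools.permutations(range(n)):
            cost = 0.0
            for k in range(n):
                r, b = order_r[k], bp[k]
                cost += sq_dist(reds[r], blues[b])
                cost += sq_dist(blues[b], reds[order_r[(k + 1) % n]])
            best = min(best, cost)
    return best

random.seed(1)
n = 4
reds = [(random.random(), random.random()) for _ in range(n)]
blues = [(random.random(), random.random()) for _ in range(n)]
a, t = assignment_cost(reds, blues), bipartite_tsp_cost(reds, blues)
# the tour's edges split into two perfect matchings, hence t >= 2*a
print(a, t, t / a)
```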
Selberg integrals in 1D random Euclidean optimization problems
We consider a set of Euclidean optimization problems in one dimension, where
the cost function associated to a pair of points is the Euclidean distance
between them raised to an arbitrary power, and the points are chosen at random
with flat measure. We derive the exact average cost for the random assignment
problem, for any number of points, by using Selberg's integrals. Some variants
of these integrals allow us to derive also the exact average cost for the
bipartite travelling salesman problem.
Comment: 9 pages, 2 figures
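The mechanism behind such exact formulas can be previewed numerically. For a convex cost (power larger than one) the optimal one-dimensional assignment simply matches the two point sets in sorted order, so the average cost reduces to moments of uniform order statistics; for the quadratic cost this yields the closed form N/(3(N+1)) for N points per color on the unit interval. The sketch below cross-checks both facts by brute force at small N; it is illustrative only.

```python
import itertools
import random

def optimal_cost(xs, ys, p=2):
    # brute force over all matchings of points xs to points ys
    return min(sum(abs(x - ys[perm[i]]) ** p for i, x in enumerate(xs))
               for perm in itertools.permutations(range(len(xs))))

def sorted_cost(xs, ys, p=2):
    # for convex cost (p > 1) the optimal 1D matching is the ordered one
    return sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))

random.seed(0)
N, trials = 5, 2000
acc = 0.0
for _ in range(trials):
    xs = [random.random() for _ in range(N)]
    ys = [random.random() for _ in range(N)]
    # the ordered matching is optimal on every instance...
    assert abs(optimal_cost(xs, ys) - sorted_cost(xs, ys)) < 1e-12
    acc += sorted_cost(xs, ys)
# ...and the Monte Carlo average approaches N/(3(N+1)) for p = 2
print(acc / trials, N / (3 * (N + 1)))
```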
The Random Fractional Matching Problem
We consider two formulations of the random-link fractional matching problem,
a relaxed version of the more standard random-link (integer) matching problem.
In one formulation, we allow each node to be linked to itself in the optimal
matching configuration. In the other one, on the contrary, such a link is
forbidden. Both problems have the same asymptotic average optimal cost as the
random-link matching problem on the complete graph. Using a replica approach
and previous results of Wästlund [Acta Mathematica 204, 91-150 (2010)], we
analytically derive the finite-size corrections to the asymptotic optimal cost.
We compare our results with numerical simulations and we discuss the main
differences between the random-link fractional matching problems and the
random-link matching problem.
Comment: 24 pages, 3 figures
Typical and atypical solutions in non-convex neural networks with discrete and continuous weights
We study the binary and continuous negative-margin perceptrons as simple
non-convex neural network models learning random rules and associations. We
analyze the geometry of the landscape of solutions in both models and find
important similarities and differences. Both models exhibit subdominant
minimizers which are extremely flat and wide. These minimizers coexist with a
background of dominant solutions which are composed of an exponential number of
algorithmically inaccessible small clusters for the binary case (the frozen
1-RSB phase), or a hierarchical structure of clusters of different sizes for
the spherical case (the full RSB phase). In both cases, when a certain
threshold in constraint density is crossed, the local entropy of the wide flat
minima becomes non-monotonic, indicating a break-up of the space of robust
solutions into disconnected components. This has a strong impact on the
behavior of algorithms in binary models, which cannot access the remaining
isolated clusters. For the spherical case the behavior is different, since even
beyond the disappearance of the wide flat minima the remaining solutions are
shown to always be surrounded by a large number of other solutions at any
distance, up to capacity. Indeed, we exhibit numerical evidence that algorithms
seem to find solutions up to the SAT/UNSAT transition, which we compute here
using a 1RSB approximation. For both models, the generalization performance as
a learning device is shown to be greatly improved by the existence of wide flat
minimizers, even when trained in the highly underconstrained regime of very
negative margins.
Comment: 34 pages, 13 figures
Wide flat minima and optimal generalization in classifying high-dimensional Gaussian mixtures
We analyze the connection between minimizers with good generalizing properties and high local entropy regions of a threshold-linear classifier in Gaussian mixtures with the mean squared error loss function. We show that there exist configurations that achieve the Bayes-optimal generalization error, even in the case of unbalanced clusters. We explore analytically the error-counting loss landscape in the vicinity of a Bayes-optimal solution, and show that the closer we get to such configurations, the higher the local entropy, implying that the Bayes-optimal solution lies inside a wide flat region. We also consider the algorithmically relevant case of targeting wide flat minima of the (differentiable) mean squared error loss. Our analytical and numerical results show not only that in the balanced case the dependence on the norm of the weights is mild, but also, in the unbalanced case, that the performance can be improved